Sparse Indexing: Large Scale, Inline Deduplication Using Sampling and Locality

Authors

Mark Lillibridge

Kave Eshghi

Deepavali Bhagwat

Vinay Deolalikar

Greg Trezise

Peter Camble

Similar resources

HPDedup: A Hybrid Prioritized Data Deduplication Mechanism for Primary Storage in the Cloud

Eliminating duplicate data in primary storage of clouds increases the cost-efficiency of cloud service providers as well as reduces the cost of users for using cloud services. Most existing primary deduplication techniques either use inline caching to exploit locality in primary workloads or use postprocessing deduplication running in system idle time to avoid the negative impact on I/O perform...

iDedup: latency-aware, inline data deduplication for primary storage

Deduplication technologies are increasingly being deployed to reduce cost and increase space-efficiency in corporate data centers. However, prior research has not applied deduplication techniques inline to the request path for latency sensitive, primary workloads. This is primarily due to the extra latency these techniques introduce. Inherently, deduplicating data on disk causes fragmentation t...

A Scalable Inline Cluster Deduplication Framework for Big Data Protection

Cluster deduplication has become a widely deployed technology in data protection services for Big Data to satisfy the requirements of service level agreement (SLA). However, it remains a great challenge for cluster deduplication to strike a sensible tradeoff between the conflicting goals of scalable deduplication throughput and high duplicate elimination ratio in cluster systems with low-end in...

ChunkStash: Speeding Up Inline Storage Deduplication Using Flash Memory

Storage deduplication has received recent interest in the research community. In scenarios where the backup process has to complete within short time windows, inline deduplication can help to achieve higher backup throughput. In such systems, the method of identifying duplicate data, using disk-based indexes on chunk hashes, can create throughput bottlenecks due to disk I/Os involved in index l...

A Novel Way of Deduplication Approach for Cloud Backup Services Using Block Index Caching Technique

Data deduplication describes an approach that reduces the storage capacity needed to store data or the amount of data that has to be transferred over the network. Cloud storage has received increasing attention from industry as it offers infinite storage resources that are available on demand. Source deduplication is useful in cloud backup as it saves network bandwidth and reduces network space. Deduplication is the pr...

Publication date: 2009
